Join predicate push-down optimizations

ABSTRACT

Join predicate push down transformations push down a join predicate of an outer query into a view. Among the types of views for which join predicate push down is performed are a view with a GROUP BY or DISTINCT operator, an anti-joined or semi-joined view, and a view that contains one or more nested views. During optimization, join predicate push down may be used to generate many transformed queries for comparison. The number of query transformations performed for comparison is managed.

RELATED APPLICATIONS

The present application claims priority to U.S. Provisional Application No. 60/782,785 entitled Cost Based Query Transformation—Join Factorization And Group By Placement, filed on Mar. 15, 2006 by Hong Su, et al., the entire content of which is hereby incorporated by reference for all purposes as if fully set forth herein.

The present application is related to U.S. patent application Ser. No. ______, attorney docket No. 50277-3144, entitled Efficient Interaction among Cost-Based Transformations, filed by Rafi Ahmed and Allison Lee, on the same day herewith, the entire content of which is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to database systems, and in particular, to optimization of queries executed by a database system.

BACKGROUND

Relational and object-relational database management systems store information in tables of rows in a database. To retrieve data, queries that request data are submitted to a database server, which computes the queries and returns the data requested.

Query statements submitted to the database server should conform to the syntactical rules of a particular query language. One popular query language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved.

A query submitted to a database server is evaluated by a query optimizer. Based on the evaluation, the query optimizer generates an execution plan that defines operations for executing the query. Typically, the query optimizer generates an execution plan optimized for efficient execution.

When a query optimizer evaluates a query, it determines various “candidate execution plans” and selects an optimal execution plan. The query may be transformed into one or more semantically equivalent queries. For the query and the one or more transformed queries, various candidate execution plans are generated.

In general, a query optimizer generates optimized execution plans when the query optimizer is able to perform more kinds of transformations under more kinds of conditions. Based on the foregoing, there is clearly a need for more ways of transforming queries.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 is a diagram of a query optimizer according to an embodiment of the present invention.

FIG. 2 is a diagram of a computer system that may be used in an implementation of an embodiment of the present invention.

DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

In a join predicate pushdown transformation, a join predicate of an outer query is pushed down into a view. Described herein are novel transformations for performing join predicate push-down, creating more ways of transforming a query. A query optimizer is able to create more kinds of transformations, creating more possible ways of optimizing a query. During optimization, there are so many possible transformations that could be generated and compared that doing so is too costly. See Cost Based Query Transformation in Oracle, by Rafi Ahmed, Allison Lee, Andrew Witkowski, Dinesh Das, Hong Su, Mohamed Zait, Thierry Cruanes (32nd International Conference on Very Large Databases, 2006).

Illustrative Operational Environment

FIG. 1 is a diagram depicting a query optimizer and related components within a database server (not shown). Generally, a server, such as a database server, is a combination of integrated software components and an allocation of computational resources, such as memory, a node, and processes on the node for executing the integrated software components, where the combination of the software and computational resources is dedicated to providing a particular type of function on behalf of clients of the server. A database server governs and facilitates access to a particular database, processing requests by clients to access the database.

A database comprises data and metadata that is stored on a persistent memory mechanism, such as a set of hard disks. Such data and metadata may be stored in a database logically, for example, according to relational and/or object-relational database constructs. Database applications interact with a database server by submitting to the database server commands that cause the database server to perform operations on data stored in a database. A database command may be in the form of a database statement. For the database server to process the database statements, the database statements must conform to a database language supported by the database server. One non-limiting database language supported by many database servers is SQL, including proprietary forms of SQL supported by such database servers as Oracle (e.g. Oracle Database 10g). SQL data definition language (“DDL”) instructions are issued to a database server to create or configure database objects, such as tables, views, or complex types.

Generally, data is stored in a database in one or more data containers, each container contains records, and the data within each record is organized into one or more fields. In relational database systems, the data containers are typically referred to as tables, the records are referred to as rows, and the fields are referred to as columns. In object-oriented databases, the data containers are typically referred to as object classes, the records are referred to as objects, and the fields are referred to as attributes. Other database architectures may use other terminology. Systems that implement the present invention are not limited to any particular type of data container or database architecture. However, for the purpose of explanation, the examples and the terminology used herein shall be that typically associated with relational or object-relational databases. Thus, the terms “table”, “row” and “column” shall be used herein to refer respectively to the data container, record, and field.

Query Optimizer and Execution Plans

Referring to FIG. 1, query parser 110 receives a query statement QS and generates an internal query representation QR of the query statement. Typically, the internal query representation is a set of interlinked data structures that represent various components and structures of a query statement. The internal query representation may be in the form of a graph of nodes, each interlinked data structure corresponding to a node and to a component of the represented query statement. The internal representation is typically generated in memory for evaluation, manipulation, and transformation by query optimizer 120.

The term query is used herein to refer to any form of representing a query, including a query in the form of a database statement or in the form of an internal query representation. Query optimizer 120 may receive a query from an entity other than query parser 110, where the query received is in the form of an internal query representation.

Query optimizer 120 generates one or more different candidate execution plans for a query, which are evaluated by query optimizer 120 to determine which should be used to compute the query. For query QS, query optimizer 120 generates candidate execution plans P₁, P₂ through P_N.

Execution plans may be represented by a graph of interlinked nodes, referred to herein as operators, that each corresponds to a step of an execution plan, referred to herein as an execution plan operation. The hierarchy of the graphs represents the order in which the execution plan operations are performed and how data flows between each of the execution plan operations. Execution plan operations include, for example, a table scan, an index scan, hash-join, sort-merge join, nested-loop join, and filter.

Query optimizer 120 may optimize a query by transforming the query. In general, transforming a query involves rewriting a query into another query that produces the same result and that can potentially be executed more efficiently, i.e. one for which a potentially more efficient and less costly execution plan can be generated. Examples of query transformation include view merging, subquery unnesting, predicate move-around and pushdown, common subexpression elimination, outer-to-inner join conversion, materialized view rewrite, star transformation, and, importantly, join predicate push down. A query is rewritten by manipulating a deep copy of the query representation to form a transformed query representation representing a transformed query. The query as transformed is referred to herein as the transformed query; the query transformed is referred to as the base query.

Query optimizer 120 may perform more than one transformation for evaluation. Each transformed query generated for a query is referred to as a candidate transformed query. For query QS, query optimizer 120 generates candidate transformed queries T₁, T₂ . . . T_N. A transformed query rewritten to generate another transformed query is referred to herein as a base query for the other transformed query. The query originally received by the query optimizer 120 is referred to as the original query.

The original query an optimizer optimizes (e.g. query QS) and the alternate transformed queries generated for the query are referred to individually as a candidate query and collectively as the query search space. The one or more candidate execution plans generated for each query in the query search space are collectively referred to as the plan search space. The query search space generated by query optimizer 120 for query statement QS includes transformations T₁, T₂ . . . T_N and query QS; the plan search space comprises P₁, P₂ . . . P_N.

The query search space and the plan search space are collectively referred to herein as the search space. Thus, a search space may contain one or more candidate queries, including candidate transformed queries, and/or one or more candidate execution plans for those candidate queries.

Cost Estimation

To evaluate the candidate execution plans in the search space, query optimizer 120 estimates a cost of each candidate execution plan and compares the estimated query costs to select an execution plan for execution. In an embodiment, the estimated query cost is generated by a query cost estimator 130, which may be a component of query optimizer 120. For a plan P_i supplied by query optimizer 120, cost estimator 130 computes and generates an estimated query cost E_i. In general, the estimated query cost represents an estimate of computer resources expended to execute an execution plan. The estimated cost may be represented as the execution time required to execute an execution plan. To determine which candidate execution plan in the search space to execute, query optimizer 120 may select the candidate execution plan with the lowest estimated cost.
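The selection step described above can be sketched in a few lines of Python. This is a minimal illustration only: the class `CandidatePlan`, the function `select_plan`, and the cost figures are hypothetical names and values introduced here, and the cost estimator itself is stood in for by precomputed numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePlan:
    """A candidate execution plan P_i paired with its estimated cost E_i."""
    name: str
    estimated_cost: float  # e.g. estimated execution time (hypothetical units)


def select_plan(candidates):
    """Return the candidate plan with the lowest estimated cost."""
    if not candidates:
        raise ValueError("the plan search space is empty")
    return min(candidates, key=lambda plan: plan.estimated_cost)


# Hypothetical plan search space with made-up cost estimates.
plans = [
    CandidatePlan("P1", 120.0),
    CandidatePlan("P2", 45.5),
    CandidatePlan("P3", 80.0),
]
best = select_plan(plans)
print(best.name, best.estimated_cost)
```

In this sketch ties are broken by position in the list, which is an arbitrary choice; the text does not say how an optimizer would break ties.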

Join Predicate Push-Down

In a join predicate pushdown, a join predicate from an outer query that references a column of a view of the outer query is “pushed down” into the view. To qualify for join predicate pushdown, the join predicate must satisfy one or more criteria. At a minimum, a join predicate belonging to an outer query should reference both a column of the view and a column of a table listed in the FROM clause of the outer query. Various embodiments may require other criteria. Join predicate pushdown is illustrated with the following base query QA.

QA = SELECT T1.C, T2.x
     FROM T1, T2,
          (SELECT T4.x, T3.y
           FROM T4, T3
           WHERE T3.p = T4.q and T4.k > 4) V
     WHERE T1.c = T2.d and T1.x = V.x (+) and T2.f = V.y (+);

Query QA includes view V. V is the alias or label for the subquery expression (SELECT T4.x, T3.y FROM T4, T3 WHERE T3.p=T4.q and T4.k>4). The subquery expression is referred to herein as a view because it is a subquery expression among an outer query's FROM list items and can be treated, to a degree, like a view or table. Other tables listed in the FROM list are referred to herein as outer tables with respect to the outer query and/or the view. With respect to the view V, tables T1 and T2 are outer tables, while tables T3 and T4 are not.

Under join predicate pushdown, query QA is transformed to query QA′ as follows.

QA′ = SELECT T1.C, T2.x
      FROM T1, T2,
           (SELECT T4.x, T3.y
            FROM T4, T3
            WHERE T3.p = T4.q and T4.k > 4 and
                  T1.x = T4.x and T2.f = T3.y) V
      WHERE T1.c = T2.d;

The join predicate T1.x=V.x (+) of the outer query is pushed down into the view by rewriting the view V to include the join predicate T1.x=T4.x. T4.x is the equivalent of the view column V.x. Similarly, the join predicate T2.f=V.y (+) is pushed down into the view as T2.f=T3.y. The pushed down join predicates do not specify outer-join notation; the outer-join is internally represented by the table being outer-joined.

A pushed-down predicate opens up new access paths, which are exploited to form candidate execution plans that may more efficiently compute a query. For example, a candidate execution plan may compute the join based on join predicate T2.f=T3.y in QA′ using an index on either T2.f or T3.y in an index nested-loops join, which is not possible without this transformation.
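The minimum qualification criterion and the column substitution used for QA can be sketched as follows. This is an illustrative Python model, not the optimizer's internal representation: `ColumnRef`, `JoinPredicate`, `qualifies_for_pushdown`, and `push_down` are hypothetical names, predicates are reduced to pairs of column references, and outer-join markers are omitted.

```python
from typing import NamedTuple


class ColumnRef(NamedTuple):
    qualifier: str  # table or view alias, e.g. "T1" or "V"
    column: str


class JoinPredicate(NamedTuple):
    left: ColumnRef
    right: ColumnRef


def qualifies_for_pushdown(pred, view_alias, outer_tables):
    """Minimum criterion: the predicate references both a column of the
    view and a column of an outer table in the outer query's FROM list."""
    qualifiers = {pred.left.qualifier, pred.right.qualifier}
    references_view = view_alias in qualifiers
    references_outer = bool((qualifiers - {view_alias}) & set(outer_tables))
    return references_view and references_outer


def push_down(pred, view_alias, view_columns):
    """Replace each view column with its equivalent column inside the view."""
    def resolve(ref):
        if ref.qualifier == view_alias:
            return view_columns[ref.column]
        return ref
    return JoinPredicate(resolve(pred.left), resolve(pred.right))


# View V of query QA: V.x is T4.x and V.y is T3.y.
view_columns = {"x": ColumnRef("T4", "x"), "y": ColumnRef("T3", "y")}
outer_predicates = [
    JoinPredicate(ColumnRef("T1", "c"), ColumnRef("T2", "d")),
    JoinPredicate(ColumnRef("T1", "x"), ColumnRef("V", "x")),
    JoinPredicate(ColumnRef("T2", "f"), ColumnRef("V", "y")),
]
eligible = [p for p in outer_predicates
            if qualifies_for_pushdown(p, "V", ["T1", "T2"])]
pushed = [push_down(p, "V", view_columns) for p in eligible]
for p in pushed:
    print(f"{p.left.qualifier}.{p.left.column} = {p.right.qualifier}.{p.right.column}")
```

The predicate T1.c=T2.d does not reference the view and therefore stays in the outer query, matching QA′.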

A query's view may be a UNION of multiple subquery branches. Such a query may be transformed using join predicate push-down by pushing down multiple join predicates, as illustrated with the following base query QB.

QB = SELECT T1.C, T2.x
     FROM T1, T2,
          (SELECT T4.x AS X, T3.y AS Y
           FROM T4, T3
           WHERE T3.p = T4.q and T4.k > 4
           UNION ALL
           SELECT T5.a AS X, T6.b AS Y
           FROM T5, T6
           WHERE T5.m = T6.n) V
     WHERE T1.c = T2.d and T1.x = V.x and T2.f = V.y;

Query QB may be transformed into QB′, in which the qualifying join predicates of the outer query have been pushed down inside each branch of the UNION ALL view.

QB′ = SELECT T1.C, T2.x
      FROM T1, T2,
           (SELECT T4.x, T3.y
            FROM T4, T3
            WHERE T3.p = T4.q and T4.k > 4 and
                  T1.x = T4.x and T2.f = T3.y
            UNION ALL
            SELECT T5.a, T6.b
            FROM T5, T6
            WHERE T5.m = T6.n and
                  T1.x = T5.a and T2.f = T6.b) V
      WHERE T1.c = T2.d;

The outer query join predicates T1.x=V.x and T2.f=V.y have been pushed down into the first branch of the view as T1.x=T4.x and T2.f=T3.y, respectively, and into the second branch as T1.x=T5.a and T2.f=T6.b, respectively. Again, join predicate push-down opens up new access paths, and therefore allows the view to be joined with outer tables using index-based nested-loop join.

Join Predicate Push Down for GROUP BY Views

According to an embodiment, novel types of join predicate pushdown transformations may be performed with different types of views. One such view is a GROUP BY view, which is illustrated by the following query QG.

QG = SELECT T1.z, V.x, V.vsum
     FROM T1, T2,
          (SELECT T4.x, SUM(T3.y) AS vsum
           FROM T4, T3
           WHERE T3.p = T4.q
           GROUP BY T4.x) V
     WHERE T1.c = T2.d and V.vsum > 10 and T1.y = V.x;

Under join predicate pushdown for GROUP BY views, join predicates that reference a non-aggregation SELECT list item of a view are pushed down into the view. In query QG, the join predicate T1.y=V.x references a non-aggregation SELECT list item V.x of inline view V. Thus, this predicate may be pushed down into the view. As a further optimization, if all group-by non-aggregate items participate in join predicates with the view, these join predicates are equi-joins, and these join predicates are pushed down into the view, then the expensive GROUP BY operator may be removed from the view. This removal is valid because correlation on equality conditions acts as a grouping on those column values. Thus, query QG may be transformed into query QG′, as follows.

QG′ = SELECT T1.z, T1.y, V.vsum
      FROM T1, T2,
           (SELECT SUM(T3.y) AS vsum
            FROM T4, T3
            WHERE T3.p = T4.q and T1.y = T4.x) V
      WHERE T1.c = T2.d and V.vsum > 10;

The join predicate T1.y=V.x is pushed down into the view as T1.y=T4.x, in which V.x is substituted with its equivalent T4.x. Since a join predicate is on every item of the GROUP BY list (i.e. T4.x) in query QG, the GROUP BY operator has been removed from QG′.
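The condition for removing the GROUP BY operator can be written as a small check. The Python below is a sketch under simplifying assumptions: `PushedPredicate` and `can_remove_group_by` are hypothetical names, columns are plain strings, and the column T4.z in the second call is invented purely to show a failing case.

```python
from typing import NamedTuple


class PushedPredicate(NamedTuple):
    outer_column: str  # column of an outer table, e.g. "T1.y"
    operator: str      # comparison operator, e.g. "="
    view_item: str     # equivalent column inside the view, e.g. "T4.x"


def can_remove_group_by(group_by_items, pushed_predicates):
    """True when every GROUP BY item is covered by a pushed-down equi-join."""
    equi_joined = {p.view_item for p in pushed_predicates if p.operator == "="}
    return bool(group_by_items) and set(group_by_items) <= equi_joined


# Query QG: GROUP BY T4.x, with T1.y = V.x pushed down as T1.y = T4.x.
qg_pushed = [PushedPredicate("T1.y", "=", "T4.x")]
print(can_remove_group_by(["T4.x"], qg_pushed))
# Hypothetical variant with a second grouping column that has no join predicate.
print(can_remove_group_by(["T4.x", "T4.z"], qg_pushed))
# Hypothetical variant in which the pushed predicate is not an equi-join.
print(can_remove_group_by(["T4.x"], [PushedPredicate("T1.y", ">", "T4.x")]))
```

Only the first call reports that the operator may be removed; the other two fail the coverage and equi-join requirements respectively.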

Join Predicate Push Down for DISTINCT Views

Another type of view subject to join predicate push-down is a DISTINCT view, which specifies a DISTINCT operator or a variant in the SELECT clause. The following query QD illustrates a DISTINCT view according to an embodiment of the present invention.

QD = SELECT T1.C, T2.x
     FROM T1, T2,
          (SELECT DISTINCT T4.x, T3.y
           FROM T4, T3
           WHERE T3.p = T4.q) V
     WHERE T1.c = T2.d and T1.x = V.x and T2.f = V.y;

In the join predicate pushdown, one or more join predicates of the outer query are pushed down into a DISTINCT view. If the join predicates with the view are equi-joins on all the SELECT list items of the view and all these join predicates are pushed down into the view, then an additional optimization can be done by removing the expensive DISTINCT operator and performing nested-loop semi-join rather than nested-loop join for joins involving the SELECT list items. Accordingly, query QD may be transformed into the following transformed query QD′.

QD′ = SELECT T1.C, T2.x
      FROM T1, T2,
           (SELECT T4.x, T3.y
            FROM T4, T3
            WHERE T3.p = T4.q and T1.x = T4.x and T2.f = T3.y) V
      WHERE T1.c = T2.d;

QD has been transformed into QD′. The outer query join predicates T1.x=V.x and T2.f=V.y are pushed into view V as T1.x=T4.x and T2.f=T3.y, respectively. The DISTINCT operator has been removed from QD′. The join based on join predicate T1.x=T4.x and the join based on join predicate T2.f=T3.y are performed using a nested-loops semi-join rather than a nested-loops join. While QD′ does not represent these semi-join operations explicitly, a candidate execution plan for QD′ implements these equi-joins as a nested-loop semi-join.
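Why the DISTINCT operator can be dropped when a semi-join is used may be easier to see on toy data. The Python below is a sketch with invented rows: each outer row stands for an already-joined T1/T2 row carrying T1.c, T1.x and T2.f, each view row is a (T4.x, T3.y) pair, and the function names are hypothetical.

```python
def join_with_distinct_view(outer_rows, view_rows):
    """Nested-loop join against the view after applying DISTINCT (as in QD)."""
    distinct_view = set(view_rows)
    return [o for o in outer_rows
            for v in distinct_view
            if (o["x"], o["f"]) == v]


def semi_join_without_distinct(outer_rows, view_rows):
    """Nested-loop semi-join against the raw view rows (as in QD')."""
    result = []
    for o in outer_rows:
        for v in view_rows:
            if (o["x"], o["f"]) == v:
                result.append(o)
                break  # stop at the first match: this is the semi-join
    return result


def plain_join_without_distinct(outer_rows, view_rows):
    """Ordinary join with DISTINCT removed: duplicates leak into the result."""
    return [o for o in outer_rows
            for v in view_rows
            if (o["x"], o["f"]) == v]


# Invented data; the view contains duplicate (x, y) pairs.
outer_rows = [
    {"c": 1, "x": 10, "f": 20},
    {"c": 2, "x": 11, "f": 21},
    {"c": 3, "x": 12, "f": 22},
]
view_rows = [(10, 20), (10, 20), (12, 22), (12, 22), (12, 22), (99, 99)]

with_distinct = join_with_distinct_view(outer_rows, view_rows)
with_semi_join = semi_join_without_distinct(outer_rows, view_rows)
print([o["c"] for o in with_distinct], [o["c"] for o in with_semi_join])
print(len(plain_join_without_distinct(outer_rows, view_rows)))
```

Because the equi-joins cover every SELECT list item of the view, each outer row can match at most one distinct view row, so stopping at the first match yields the same rows as joining with the de-duplicated view.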

Join Predicate Push-Down for Anti-/Semi-Joined Views

Another type of view subject to join predicate push-down is an anti-/semi-joined view, in which the join predicate that is “pushed down” into a view specifies an anti-join or semi-join operation. Such views are generated by, for example, subquery unnesting or MINUS-to-join conversion. Note that, unlike outer-join, a view or a table may be anti-/semi-joined with more than one table. In these cases, there is more than one left table. A left table refers to the table whose rows are returned for an anti-/semi-join operation. This partial order information is maintained when there is more than one left table.

When an anti-/semi-joined view generated by subquery unnesting undergoes join predicate push-down transformation, the evaluation of such views is akin to the filter evaluation of the original subquery form but with one significant difference: the views into which the predicates are pushed can be evaluated at any point in the join order as long as the partial order is imposed, unlike subquery filter evaluation, which generally takes place at the end of all join evaluation.

The following query QA is used to illustrate an anti-joined view that is generated using subquery unnesting and then subsequently transformed using join predicate push-down.

QA = SELECT T1.c, T2.x
     FROM T1, T2
     WHERE T1.c = T2.d and
           NOT EXISTS (SELECT 1
                       FROM T4, T3
                       WHERE T3.p = T4.q and T2.y = T3.y);

Subquery unnesting produces the following query QA′ with anti-joined view V. Note that the anti-join operator A= is non-standard SQL and is used here for the purpose of illustration only.

QA′ = SELECT T1.c, T2.x
      FROM T1, T2,
           (SELECT T3.y
            FROM T4, T3
            WHERE T3.p = T4.q) V
      WHERE T1.c = T2.d and T2.y A= V.y;

Join predicate pushdown transformation of QA′ produces the following query QA″, in which the view has undergone join predicate pushdown.

QA″ = SELECT T1.c, T2.x
      FROM T1, T2,
           (SELECT T3.y
            FROM T4, T3
            WHERE T3.p = T4.q and T2.y = T3.y) V
      WHERE T1.c = T2.d;

In QA″, the anti-join is internally represented and is not shown. When computed, QA″ allows the following join orders: (T1, T2, V), (T2, T1, V), (T2, V, T1). QA, on the other hand, normally allows only the two join orders: (T1, T2, S) and (T2, T1, S), where S represents the subquery.

The following query QS is used to illustrate a semi-joined view that is generated using subquery unnesting and then subsequently transformed using join predicate push-down.

QS = SELECT T1.c, T2.x
     FROM T1, T2
     WHERE T1.c = T2.d and
           EXISTS (SELECT 1
                   FROM T4, T3
                   WHERE T3.p = T4.q and T2.y = T3.y);

Subquery unnesting produces the following query QS′ with semi-joined view V. Note that the semi-join operator S= is non-standard SQL and is used here for the purpose of illustration only.

QS′ = SELECT T1.c, T2.x
      FROM T1, T2,
           (SELECT T3.y
            FROM T4, T3
            WHERE T3.p = T4.q) V
      WHERE T1.c = T2.d and T2.y S= V.y;

Join predicate push-down transformation of QS′ produces the following query QS″, in which the view has undergone join predicate push-down.

QS″ = SELECT T1.c, T2.x
      FROM T1, T2,
           (SELECT T3.y
            FROM T4, T3
            WHERE T3.p = T4.q and T2.y = T3.y) V
      WHERE T1.c = T2.d;

In QS″, the semi-join is internally represented and is not shown. When computed, QS″ allows the following join orders: (T1, T2, V), (T2, T1, V), (T2, V, T1). QS, on the other hand, normally allows only the two join orders: (T1, T2, S) and (T2, T1, S), where S represents the subquery.
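The join orders listed above follow from the partial order alone, which can be checked by brute-force enumeration. In the Python sketch below, `allowed_join_orders` is a hypothetical helper; the constraint that the left table T2 precedes view V comes from the pushed-down predicate T2.y=T3.y, and modelling subquery filter evaluation as "S after both T1 and T2" is an assumption made for illustration.

```python
from itertools import permutations


def allowed_join_orders(items, must_precede):
    """Enumerate the join orders that respect a partial order.

    must_precede is a list of (a, b) pairs meaning a must appear before b.
    """
    orders = []
    for order in permutations(items):
        position = {name: i for i, name in enumerate(order)}
        if all(position[a] < position[b] for a, b in must_precede):
            orders.append(order)
    return orders


# Anti-/semi-joined view after push-down: only the left table T2 must precede V.
view_orders = allowed_join_orders(["T1", "T2", "V"], [("T2", "V")])
print(view_orders)

# Subquery filter S evaluated after all joins (modelling assumption).
filter_orders = allowed_join_orders(["T1", "T2", "S"], [("T1", "S"), ("T2", "S")])
print(filter_orders)
```

The extra order (T2, V, T1) is what the transformation buys: the view can be evaluated before T1 is joined.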

Join Predicate Push-Down for Multi-Level Queries

A view may contain another view, the latter being referred to herein as a nested view. A view that contains a nested view is referred to herein as a multi-level view. A multi-level view is illustrated by the following query QM.

QM = SELECT T1.C, T2.x
     FROM T1, T2,
          (SELECT T4.x, V2.k
           FROM T4, T3,
                (SELECT T5.y, T6.k
                 FROM T5, T6
                 WHERE T5.d = T6.d) V2
           WHERE T4.r = V2.y (+) and T3.p = T4.q) V1
     WHERE T1.c = T2.d and T1.x = V1.k (+);

In QM, the nested view is V2. Under join predicate push down, join predicates may be pushed down to the lowest nested view of a multi-level view. In fact, to open an access path, such as an index access path, a join predicate should be pushed down to the lowest level nested view, which in the case of query QM is V2. Under join predicate push down, QM may be transformed to QM′ as follows.

QM′ = SELECT T1.C, T2.x
      FROM T1, T2,
           (SELECT T4.x, V2.k
            FROM T4, T3,
                 (SELECT T5.y, T6.k
                  FROM T5, T6
                  WHERE T5.d = T6.d and T1.x = T6.k) V2
            WHERE T4.r = V2.y (+) and T3.p = T4.q) V1
      WHERE T1.c = T2.d;

The join predicate T1.x=V1.k (+) has been pushed to the nested view V2 as join predicate T1.x=T6.k.
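Finding the lowest-level view for a pushed predicate amounts to following a view column through the SELECT lists of the nested views until a base table column is reached. The Python below sketches that walk for QM; `resolve_to_lowest_view` and the dictionary layout are hypothetical, and only simple column-to-column SELECT list items are modelled.

```python
def resolve_to_lowest_view(ref, views):
    """Follow a (qualifier, column) reference down through nested views.

    views maps a view alias to {select_item: (qualifier, column)}.
    Returns (lowest_view_alias, base_column); the alias is None when the
    reference is already a base table column.
    """
    qualifier, column = ref
    lowest_view = None
    while qualifier in views:
        lowest_view = qualifier
        qualifier, column = views[qualifier][column]
    return lowest_view, (qualifier, column)


# SELECT lists of the views in query QM.
qm_views = {
    "V1": {"x": ("T4", "x"), "k": ("V2", "k")},
    "V2": {"y": ("T5", "y"), "k": ("T6", "k")},
}

# T1.x = V1.k (+): V1.k is V2.k, which is T6.k, so the predicate lands in V2.
target_view, base_column = resolve_to_lowest_view(("V1", "k"), qm_views)
print(target_view, base_column)
```

Applied to V1.x instead, the same walk stops in V1 at T4.x, since that column does not come from the nested view.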

Join Predicate Push-Down Applicable to Various Kinds of Views

Embodiments of the invention have been illustrated by pushing down join predicates into views that are inline views. A view is a query and/or subquery that may be referenced by a label associated with the view as if the view is a table. The query or subquery of a view is referred to herein as the view's definition. For an inline view in a query, a subquery in the query is the view's definition and the alias assigned by the query to the subquery is the label for the subquery. For example, query QA includes inline view V. V is the alias and label for the subquery (SELECT T4.x, T3.y FROM T4, T3 WHERE T3.p=T4.q and T4.k>4).

Views may also be defined by a database system, using, for example, Data Definition Language commands. Once defined, subsequent queries may refer to the views as if the views are tables defined by the database system; the queries do not contain the views' definitions; rather, the database system metadata holds the views' definitions. When a query referencing a view is received by a database system, the view in the query is in effect replaced with the view's definition. The techniques described herein may then be applied to the view's definition.

Search Space Strategies

In general, when determining how to optimize a query, query optimizer 120 determines a set of query transformations to generate for the query search space. The estimated query cost of each query in the query search space is then computed and compared. A query may contain multiple views into which a join predicate may be pushed. As a result, there may be many join predicate push-down transformations that can be performed. Determining and generating a transformation and estimating its query execution cost consumes computer resources; doing these for all or even a proportion of all possible join predicate push-down transformations for an original query may incur a cost that is significant compared to, if not greater than, the cost of executing the original query. Thus, to optimize the cost of query optimization, “transformation search space strategies” are used to select candidate join predicate push-down transformations in order to limit the size of the query search space and the cost of query optimization.

Such transformation search space strategies may be based, at least in part, on heuristics. Heuristics are rules that specify conditions under which a certain type of transformation is or is not performed, and are based on assumptions that are generally true but may not be true for a particular base query. An example of a heuristic is to push down a join predicate only if it opens an index access path. That is, when pushed down into a view, the join predicate references an indexed column. The underlying assumption for this heuristic is that transformation under these circumstances causes an index-based nested-loops join for the pushed down predicate. In the case of a query with multiple possible join predicates that may be pushed down, a transformation that pushes down one of the possible join predicates is only generated for the query search space if the join predicate opens up an index access path.
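The index heuristic can be sketched as a filter over candidate predicates. The Python below uses the two pushable predicates of the first query QA; `opens_index_access_path` is a hypothetical helper, and the assumption that an index exists on T3.y but not on T4.x is invented for the example.

```python
def opens_index_access_path(pushed_column, indexed_columns):
    """Heuristic: the pushed-down predicate must reference an indexed column."""
    return pushed_column in indexed_columns


# Outer-query predicate mapped to the view-internal column it references
# once pushed down (taken from QA and QA').
candidates = {
    "T1.x = V.x (+)": "T4.x",
    "T2.f = V.y (+)": "T3.y",
}
indexed_columns = {"T3.y"}  # assumed catalog contents, for illustration only

selected = [predicate for predicate, column in candidates.items()
            if opens_index_access_path(column, indexed_columns)]
print(selected)
```

Under these assumptions only the transformation that pushes T2.f=V.y (+) would be generated for the query search space; with a different set of indexes the outcome changes accordingly.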

Search space strategies also include query search space generation procedures that systematically generate combinations of join predicate push-down transformations. With the exception of one, the query search space generation procedures generate some but not all possible join predicate push-down transformations. Such procedures are illustrated using the following query QS:

QS = SELECT T1.C
     FROM T1,
          (SELECT T2.x as x
           FROM T2, T3
           WHERE T2.z = T3.z) V1,
          (SELECT T4.y as y
           FROM T4, T5
           WHERE T4.z = T5.z) V2
     WHERE T1.x = V1.x (+) and T1.y = V2.y (+);

One approach, the “exhaustive approach”, considers the cost of every possibility for join predicate pushdown transformation. Thus, every join predicate push-down transformation is included in the query search space. In the case of QS, there are four possibilities for join predicate push-down transformations, as follows:

- QS₀₀: no predicate is pushed down.
- QS₁₀: T1.x=V1.x (+) is pushed down into V1.
- QS₀₁: T1.y=V2.y (+) is pushed down into V2.
- QS₁₁: both predicates are pushed down as above.

Under the exhaustive approach, a query execution cost for each transformation is generated. The transformation with the least cost is selected.
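A minimal Python sketch of the exhaustive approach follows. Each candidate is encoded as a tuple of 0/1 flags in the order (predicate into V1, predicate into V2), mirroring the QS subscripts; the function name and the cost table are hypothetical, and a real optimizer would call its cost estimator rather than look up fixed numbers.

```python
from itertools import product


def exhaustive_search(num_predicates, estimate_cost):
    """Cost every combination of pushed / not-pushed predicates."""
    best_state, best_cost = None, float("inf")
    evaluated = []
    for state in product((0, 1), repeat=num_predicates):
        cost = estimate_cost(state)
        evaluated.append(state)
        if cost < best_cost:
            best_state, best_cost = state, cost
    return best_state, best_cost, evaluated


# Made-up cost estimates for QS00, QS10, QS01 and QS11.
hypothetical_costs = {(0, 0): 100.0, (1, 0): 70.0, (0, 1): 110.0, (1, 1): 60.0}

state, cost, evaluated = exhaustive_search(2, hypothetical_costs.__getitem__)
print(state, cost, len(evaluated))
```

The number of candidates doubles with every additional pushable predicate, which is the cost the other strategies are meant to avoid.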

Another approach, the two-pass approach, generates a candidate query that pushes down every qualifying join predicate and compares its query cost to that of a base query in which none have been pushed down. Under the two-pass approach, of the four possibilities only candidate queries QS₀₀ and QS₁₁ (none and all) are in the query search space. Query costs for both are estimated and compared.
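The two-pass approach reduces to comparing exactly two candidates regardless of how many predicates qualify. The Python sketch below uses the same 0/1 tuple encoding as the QS subscripts; `two_pass_search` and the cost figures are hypothetical, chosen so that a partially pushed candidate would have been cheapest.

```python
def two_pass_search(num_predicates, estimate_cost):
    """Compare only 'push nothing' against 'push every qualifying predicate'."""
    none_pushed = (0,) * num_predicates
    all_pushed = (1,) * num_predicates
    candidates = [none_pushed, all_pushed]
    best = min(candidates, key=estimate_cost)
    return best, estimate_cost(best), candidates


# Made-up costs in which QS10 is the cheapest of the four possibilities.
hypothetical_costs = {(0, 0): 100.0, (1, 0): 50.0, (0, 1): 110.0, (1, 1): 60.0}

state, cost, considered = two_pass_search(2, hypothetical_costs.__getitem__)
print(state, cost, considered)
```

With these numbers the strategy settles on QS₁₁ at cost 60 and never sees QS₁₀ at cost 50, which illustrates the trade-off: optimization work stays constant, but a cheaper partial push-down can be missed.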

Under the linear approach, each join predicate that can be pushed down is considered in turn. Specifically, for each join predicate, a join predicate push-down transformation involving the join predicate is performed and a query cost is generated for the resulting transformed query. This query cost is compared to the query cost generated for the previously evaluated join predicate. In the case of the first join predicate considered, its query cost is compared to the cost of the query without any join predicate push down. If the query cost is lowered, then a decision is made to push down that join predicate. The join predicate is pushed down in any subsequent join predicate push-down evaluated under the linear approach.

For example, under the linear approach, the query cost of QS₁₀ is computed and compared to that of QS₀₀. Assume the cost of QS₁₀ is less than that of QS₀₀. It is then determined that the predicate pushed down for QS₁₀, T1.x=V1.x (+), is pushed down in subsequent transformations evaluated under the linear approach. Accordingly, in the next iteration, in which the next join predicate T1.y=V2.y (+) is considered, the transformed query generated is QS₁₁, which pushes down both predicates. Note that QS₀₁ is never considered under the linear approach. Further, as a result of evaluating the transformation for QS₁₀, QS₀₁ was excluded from the query search space while QS₁₁ was not. Thus, the decision to consider and undertake a transformation depended on the decision regarding another.
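The linear approach can be sketched as a greedy loop that keeps each push-down only if it lowers the best cost seen so far. The Python below uses the same 0/1 tuple encoding as the QS subscripts; `linear_search` and the cost table are hypothetical, with numbers chosen to reproduce the walk-through above.

```python
def linear_search(num_predicates, estimate_cost):
    """Consider each predicate in turn, keeping a push-down if it lowers cost."""
    current = [0] * num_predicates
    best_cost = estimate_cost(tuple(current))
    evaluated = [tuple(current)]
    for i in range(num_predicates):
        trial = list(current)
        trial[i] = 1  # additionally push down predicate i
        trial_cost = estimate_cost(tuple(trial))
        evaluated.append(tuple(trial))
        if trial_cost < best_cost:
            current, best_cost = trial, trial_cost
    return tuple(current), best_cost, evaluated


# Made-up cost estimates for QS00, QS10, QS01 and QS11.
hypothetical_costs = {(0, 0): 100.0, (1, 0): 70.0, (0, 1): 110.0, (1, 1): 60.0}

state, cost, evaluated = linear_search(2, hypothetical_costs.__getitem__)
print(state, cost, evaluated)
```

For n pushable predicates this sketch costs n + 1 candidates instead of 2ⁿ; here it evaluates QS₀₀, QS₁₀ and QS₁₁ and, as in the text, never costs QS₀₁.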

Hardware Overview

FIG. 2 is a block diagram that illustrates a computer system 400 upon which an embodiment of the invention may be implemented. Computer system 400 includes a bus 402 or other communication mechanism for communicating information, and a processor 404 coupled with bus 402 for processing information. Computer system 400 also includes a main memory 406, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 402 for storing information and instructions to be executed by processor 404. Main memory 406 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 404. Computer system 400 further includes a read only memory (ROM) 408 or other static storage device coupled to bus 402 for storing static information and instructions for processor 404. A storage device 410, such as a magnetic disk or optical disk, is provided and coupled to bus 402 for storing information and instructions.

Computer system 400 may be coupled via bus 402 to a display 412, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 414, including alphanumeric and other keys, is coupled to bus 402 for communicating information and command selections to processor 404. Another type of user input device is cursor control 416, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 404 and for controlling cursor movement on display 412. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

The invention is related to the use of computer system 400 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 400 in response to processor 404 executing one or more sequences of one or more instructions contained in main memory 406. Such instructions may be read into main memory 406 from another machine-readable medium, such as storage device 410. Execution of the sequences of instructions contained in main memory 406 causes processor 404 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term “machine-readable medium” as used herein refers to any medium that participates in providing data that causes a machine to operate in a specific fashion. In an embodiment implemented using computer system 400, various machine-readable media are involved, for example, in providing instructions to processor 404 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 410. Volatile media includes dynamic memory, such as main memory 406. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 402. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications. All such media must be tangible to enable the instructions carried by the media to be detected by a physical mechanism that reads the instructions into a machine.

Common forms of machine-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of machine-readable media may be involved in carrying one or more sequences of one or more instructions to processor 404 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 400 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 402. Bus 402 carries the data to main memory 406, from which processor 404 retrieves and executes the instructions. The instructions received by main memory 406 may optionally be stored on storage device 410 either before or after execution by processor 404.

Computer system 400 also includes a communication interface 418 coupled to bus 402. Communication interface 418 provides a two-way data communication coupling to a network link 420 that is connected to a local network 422. For example, communication interface 418 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 418 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 418 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 420 typically provides data communication through one or more networks to other data devices. For example, network link 420 may provide a connection through local network 422 to a host computer 424 or to data equipment operated by an Internet Service Provider (ISP) 426. ISP 426 in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 428. Local network 422 and Internet 428 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 420 and through communication interface 418, which carry the digital data to and from computer system 400, are exemplary forms of carrier waves transporting the information.

Computer system 400 can send messages and receive data, including program code, through the network(s), network link 420 and communication interface 418. In the Internet example, a server 430 might transmit a requested code for an application program through Internet 428, ISP 426, local network 422 and communication interface 418.

The received code may be executed by processor 404 as it is received, and/or stored in storage device 410, or other non-volatile storage for later execution. In this manner, computer system 400 may obtain application code in the form of a carrier wave.

In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

1. A computer-implemented method, comprising: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a join predicate that references: a column of an outer table of the outer query, and a column returned by the view; wherein said view includes: a GROUP BY operator that references a certain column upon which the column returned by the view is based, or a DISTINCT operator that references a certain column upon which the column returned by the view is based; and wherein generating the transformed query includes pushing down the join predicate to create a pushed down join predicate that references the column of the outer table and a certain column upon which the column returned by the view is based.
 2. The computer-implemented method of claim 1, wherein the step of generating a transformed query includes removing the GROUP BY operator from the view.
 3. The computer-implemented method of claim 1, wherein the step of generating a transformed query includes removing the DISTINCT operator from the view.
 4. A computer-implemented method, comprising: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a join predicate that references: a column of an outer table of the outer query, and a column returned by the view; wherein generating the transformed query includes pushing down the join predicate to create a pushed down join predicate that references the column of the outer table and a certain column returned by the view; generating an estimated query execution cost for each of a set of candidate queries that includes said particular query and said transformed query; and selecting as an optimized query for said particular query a candidate query of said candidate queries.
 5. A computer-implemented method, comprising: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a certain predicate for an anti-join, said certain predicate referencing: a column of an outer table of the outer query, and a column returned by the view; wherein generating the transformed query includes pushing down the certain predicate to create a pushed down join predicate that references the column of the outer table and a certain column returned by the view; generating an estimated query execution cost for each of a set of candidate queries that includes said particular query and said transformed query; and selecting as an optimized query for said particular query a candidate query of said candidate queries.
 6. A computer-implemented method, comprising: generating a transformed query based on a particular query, wherein said particular query includes: an outer query; a view within a FROM list of the outer query; a certain predicate for a semi-join that references: a column of an outer table of the outer query, and a column returned by the view; wherein generating the transformed query includes pushing down the certain predicate to create a pushed down join predicate that references the column of the outer table and a certain column returned by the view; generating an estimated query execution cost for each of a set of candidate queries that includes said particular query and said transformed query; and selecting as an optimized query for said particular query a candidate query of said candidate queries.
 7. A computer-implemented method, comprising: a database system generating a search space for optimizing a particular query, wherein said particular query includes: an outer query; a plurality of views in a FROM list of the outer query; for each view of said plurality of views, a certain join predicate of the outer query that references: a column of an outer table of the outer query, and a column returned by said each view; wherein the step of generating a search space includes generating a search space that includes one or more query transformations that each involve pushing down the respective certain join predicate into at least one view of said plurality of views, said search space including said particular query; and selecting an optimized query from among the search space based on query execution costs estimated for the queries in said search space.
 8. The computer-implemented method of claim 7, wherein the step of generating a search space includes performing in an order for each view of said plurality of views, certain steps of: generating a transformed query that pushes down the respective certain join predicate of said each view; estimating a query execution cost for said transformed query; making a determination of whether the estimated query execution cost is lower than a previous estimated query execution cost; and wherein said transformed query pushes down the respective certain join predicate of any view of said plurality of views for which a previous determination was made that the estimated query execution cost is lower than a previous estimated query execution cost.
 9. The computer-implemented method of claim 7, wherein generating a search space includes generating a search space that: includes a query wherein, for each view of said plurality of views, the respective certain join predicate is not pushed down; and for any query in the search space for which a certain join predicate of a particular view of the plurality of views is pushed down into the particular view, includes a transformed query in which for each view of said plurality of views, the respective certain join predicate is pushed down into said each view if one or more criteria are satisfied.
 10. The computer-implemented method of claim 7, wherein generating a search space includes generating a search space that includes a transformed query for every combination of join predicate push downs that can be performed with respect to said plurality of views.
 11. The computer-implemented method of claim 7, wherein the step of generating a search space that includes one or more query transformations that each involve pushing down the respective certain join predicate into at least one view of said plurality of views includes generating a search space that includes one or more query transformations that push down a join predicate only if one or more criteria are satisfied, said one or more criteria including that said pushed down predicate open an index access path for a view of said plurality of views.
 12. A computer-implemented method, comprising: generating a search space for a particular query, wherein said particular query includes: an outer query; a first view within the FROM list of the outer query; a first join predicate of the outer query that references: a column of an outer table of the outer query, and a column returned by said first view; a second view within the FROM list of the first view; a second join predicate of the second view that references a column returned by said first view; generating a search space that includes one or more query transformations that each involve pushing down a join predicate of the outer query to the second view; and selecting an optimized query from among the search space.
 13. A computer-implemented method, comprising: generating a transformed query by creating a view in the transformed query; wherein said transformed query includes: an outer query with a FROM list that includes said view; and a join predicate of the outer query that references: a column of an outer table of the outer query, and a column returned by said view; generating a second transformed query by performing steps that include pushing down the join predicate into the view to create a pushed down join predicate that references the column of the outer table and a certain column upon which the column returned by the view is based; and selecting an optimized query from among the search space.
 14. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 1.
 15. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 2.
 16. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 3.
 17. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 4.
 18. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 5.
 19. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 6.
 20. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 7.
 21. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 8.
 22. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 9.
 23. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 10.
 24. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 11.
 25. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 12.
 26. A computer-readable medium carrying one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform the method recited in claim 13.
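
By way of illustration only, and not as part of any claim, the search-space strategies recited in claims 8 through 10 may be sketched as follows. The sketch models a candidate query as the set of views whose join predicate has been pushed down. The function names, the view labels V1 through V3, and the cost function toy_cost are hypothetical and stand in for a real optimizer's cost estimator; the "one or more criteria" of claim 9 are omitted for brevity.

```python
from itertools import product
from typing import Callable, FrozenSet, Sequence, Tuple

# A candidate query is modeled as the set of views whose join predicate
# has been pushed down. The empty set is the untransformed query.
State = FrozenSet[str]
CostFn = Callable[[State], float]


def exhaustive_search(views: Sequence[str], cost: CostFn) -> Tuple[State, float]:
    """In the spirit of claim 10: cost every combination of push downs."""
    best = None
    for mask in product([False, True], repeat=len(views)):
        state = frozenset(v for v, pushed in zip(views, mask) if pushed)
        c = cost(state)
        if best is None or c < best[1]:
            best = (state, c)
    return best


def linear_search(views: Sequence[str], cost: CostFn) -> Tuple[State, float]:
    """In the spirit of claim 8: visit views in order and keep a push down
    only if it lowers the best estimated cost seen so far."""
    state: State = frozenset()
    best_cost = cost(state)  # cost of the untransformed query
    for v in views:
        trial = state | {v}
        c = cost(trial)
        if c < best_cost:
            state, best_cost = trial, c
    return state, best_cost


def two_pass_search(views: Sequence[str], cost: CostFn) -> Tuple[State, float]:
    """In the spirit of claim 9: compare no push down at all with push down
    into every view (eligibility criteria omitted in this sketch)."""
    none: State = frozenset()
    every: State = frozenset(views)
    return min(((none, cost(none)), (every, cost(every))), key=lambda p: p[1])


def toy_cost(state: State) -> float:
    """Hypothetical cost model, invented for this example only."""
    c = 100.0
    if "V1" in state:
        c -= 30
    if "V2" in state:
        c += 10
    if "V3" in state:
        c -= 5
    if {"V1", "V2"} <= state:
        c -= 25  # benefit that only appears when both are pushed down
    return c


views = ["V2", "V1", "V3"]
print(exhaustive_search(views, toy_cost))
print(linear_search(views, toy_cost))
print(two_pass_search(views, toy_cost))
```

Under this toy cost model, the exhaustive and two-pass strategies both select the query with all three predicates pushed down (cost 50), while the linear strategy rejects V2 when it is examined first and settles on V1 and V3 (cost 65). For n views the sketch evaluates 2^n, n+1, and 2 candidate queries respectively, which shows the trade-off between optimization effort and plan quality that managing the number of transformations is meant to address.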